Model Comparison

We compare three models: B, BB, and ZIBB.

See vignette ‘ModelOverview’ to inspect the different models.

LOO

B

Computed from 4000 by 2001 log-likelihood matrix

         Estimate    SE
elpd_loo  -5885.3  82.5
p_loo      1309.2  27.5
looic     11770.6 165.0
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count  Pct.   Min. n_eff
(-Inf, 0.5]   (good)       511  25.5%  248
 (0.5, 0.7]   (ok)         311  15.5%  85
   (0.7, 1]   (bad)        901  45.0%  19
   (1, Inf)   (very bad)   278  13.9%  3
See help('pareto-k-diagnostic') for details.

BB

Computed from 4000 by 2001 log-likelihood matrix

         Estimate    SE
elpd_loo  -6903.7 103.1
p_loo       356.2  19.0
looic     13807.4 206.2
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count  Pct.   Min. n_eff
(-Inf, 0.5]   (good)      1683  84.1%  502
 (0.5, 0.7]   (ok)         246  12.3%  111
   (0.7, 1]   (bad)         68   3.4%  28
   (1, Inf)   (very bad)     4   0.2%  4
See help('pareto-k-diagnostic') for details.

ZIBB

Computed from 4000 by 2001 log-likelihood matrix

         Estimate    SE
elpd_loo  -6686.3 101.9
p_loo       527.4  24.1
looic     13372.6 203.7
------
Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                         Count  Pct.   Min. n_eff
(-Inf, 0.5]   (good)      1382  69.1%  370
 (0.5, 0.7]   (ok)         396  19.8%  117
   (0.7, 1]   (bad)        202  10.1%  21
   (1, Inf)   (very bad)    21   1.0%  2
See help('pareto-k-diagnostic') for details.
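
The Pareto k tables can be summarised as the share of observations whose importance-sampling estimate falls in the "bad" or "very bad" bins (k > 0.7). The sketch below only re-tabulates the counts printed above in base R; the object names are illustrative and not part of the package.

```r
# Pareto k counts copied from the three loo summaries above,
# ordered as: good, ok, bad, very bad.
k_counts <- rbind(
  B    = c(511, 311, 901, 278),
  BB   = c(1683, 246, 68, 4),
  ZIBB = c(1382, 396, 202, 21)
)
colnames(k_counts) <- c("good", "ok", "bad", "very_bad")

# Each summary reports the same 2001 observations.
stopifnot(all(rowSums(k_counts) == 2001))

# Percentage of observations with k > 0.7 per model.
pct_unreliable <- 100 * rowSums(k_counts[, c("bad", "very_bad")]) / rowSums(k_counts)
print(round(pct_unreliable, 1))
```

For model B the share is close to 59%, against roughly 4% for BB and 11% for ZIBB, so the elpd_loo estimate for B is probably the least reliable of the three and should be read with caution.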

Compare all three models

                                          elpd_diff se_diff elpd_loo  p_loo   looic
loo::loo(loo::extract_log_lik(B$glm))           0.0     0.0  -5885.3 1309.2 11770.6
loo::loo(loo::extract_log_lik(ZIBB$glm))     -801.0    46.5  -6686.3  527.4 13372.6
loo::loo(loo::extract_log_lik(BB$glm))      -1018.4    35.9  -6903.7  356.2 13807.4
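
Two columns of the comparison table follow directly from the per-model summaries: elpd_diff is each model's elpd_loo minus the largest elpd_loo, and looic is -2 times elpd_loo. A minimal base R check using the printed estimates (variable names are illustrative):

```r
# elpd_loo estimates copied from the per-model summaries.
elpd_loo <- c(B = -5885.3, ZIBB = -6686.3, BB = -6903.7)

# Difference from the model with the largest elpd_loo (here B).
elpd_diff <- elpd_loo - max(elpd_loo)

# looic expresses elpd_loo on the deviance scale.
looic <- -2 * elpd_loo

print(data.frame(elpd_loo, elpd_diff, looic))
```

The se_diff column cannot be recovered this way, because it depends on the pointwise elpd differences between models rather than on the summary totals.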

Posterior predictive checks

Prediction of gene usage within repertoires [count]

  • Usage in raw counts
  • Error bars represent 95% HDI
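
As a reminder of what the error bars mean, a 95% HDI (highest density interval) is the narrowest interval that contains 95% of the posterior draws. The function below is a minimal sketch of that idea in base R, assuming a unimodal posterior; `hdi()` and the simulated draws are hypothetical and are not the routine used to produce the figures.

```r
# Narrowest interval containing a fraction `prob` of the draws.
hdi <- function(draws, prob = 0.95) {
  sorted <- sort(draws)
  n <- length(sorted)
  n_in <- ceiling(prob * n)        # draws that must fall inside the interval
  n_windows <- n - n_in + 1        # number of candidate intervals
  lower <- sorted[seq_len(n_windows)]
  upper <- sorted[n_in - 1 + seq_len(n_windows)]
  best <- which.min(upper - lower) # pick the narrowest candidate
  c(lower = lower[best], upper = upper[best])
}

set.seed(1)
draws <- rgamma(4000, shape = 2, rate = 0.1)  # hypothetical skewed posterior draws
print(hdi(draws))
print(quantile(draws, c(0.025, 0.975)))       # equal-tailed interval, for contrast
```

For skewed posteriors, as is common for usage values near zero, the HDI is shifted toward the mode relative to the equal-tailed interval.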

Prediction of gene usage within repertoires [%]

  • Usage in %
  • Error bars represent 95% HDI

Prediction error at a repertoire level

  • e[%] = |Yhat[%] - Y[%]|, or
  • e[raw count] = |Yhat[count] - Y[count]|
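
Both error definitions can be sketched in a few lines of base R. The matrices below are hypothetical toy data, and the sketch assumes that usage in % is the gene count divided by the total count of its repertoire; the vignette does not state the normalisation explicitly.

```r
# Hypothetical observed (Y) and predicted (Yhat) counts:
# genes in rows, repertoires in columns.
Y <- matrix(c(10, 30, 60,
               5, 15, 30),
            nrow = 3,
            dimnames = list(c("geneA", "geneB", "geneC"), c("rep1", "rep2")))
Yhat <- matrix(c(12, 28, 60,
                  6, 14, 30),
               nrow = 3, dimnames = dimnames(Y))

# Assumed normalisation: percentage of each repertoire's total count.
to_pct <- function(counts) 100 * sweep(counts, 2, colSums(counts), "/")

e_count <- abs(Yhat - Y)                # e[raw count]
e_pct <- abs(to_pct(Yhat) - to_pct(Y))  # e[%]

print(e_count)
print(e_pct)
```

The count error grows with repertoire size, whereas the percentage error is comparable across repertoires of different depth.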

Prediction of overall gene usage [%]

  • Error bars represent 95% HDI

Comparison of coefficients for differential gene usage

Comparisons:

  • ZIBB vs BB
  • ZIBB vs B
  • BB vs B

For each pairwise comparison, the five genes with the most discrepant usage coefficients between the two models are annotated.
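
One way such an annotation could be derived is to rank genes by the absolute difference between the coefficients of the two models. The sketch below uses simulated coefficients and assumes the absolute difference as the discrepancy measure; the vignette does not state which measure was actually used, and all names are hypothetical.

```r
# Hypothetical coefficient estimates for the same 20 genes under two models.
set.seed(42)
genes <- paste0("gene", 1:20)
coef_zibb <- setNames(rnorm(20), genes)
coef_bb <- coef_zibb + rnorm(20, sd = 0.1)
coef_bb["gene7"] <- coef_zibb["gene7"] + 2  # force one large disagreement

# Assumed discrepancy measure: absolute difference, matched by gene name.
discrepancy <- abs(coef_zibb - coef_bb[names(coef_zibb)])

# Names of the five genes with the largest discrepancy.
top5 <- names(sort(discrepancy, decreasing = TRUE))[1:5]
print(top5)
```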

Reality Check